ABSTRACT

The one group posttest only evaluation model has been 
identified as a relatively inexpensive and useful model that can 
identify program components that are not being successful. The use of 
the model is discussed and illustrated through a hypothetical 
evaluation of a compensatory (Chapter 1) program. The one group 
posttest only evaluation model makes it possible to evaluate a
compensatory program when there is no available comparison group and
no pretest data are available. The design is valuable for three 
reasons: (1) evaluators seldom, if ever, find a perfect comparison 
group; (2) it can be very difficult to find a test that serves as a 
pretest and can also measure the objectives adequately at posttest; 
and (3) the design allows for identification of the components of the 
program that are not successful, providing guidance for program 
improvement decisions. Two tables and five figures illustrate the 
discussion. (SLD) 
Problem: One of the purposes of evaluation is to foster usage of
the evaluation results for program improvement. While most of my
evaluation colleagues and I often blame administrators
for not using evaluation results, the problem is often with the
evaluation model or process used. Too often the evaluation model
does not allow for specific recommendations. A good example is the
set of evaluation models employed in the evaluation of Chapter 1.

Possible solutions: The avowed intent of the Chapter 1 evaluation
models as developed by RMC and promulgated by the Technical
Assistance Centers was to determine the effectiveness of the monies
spent on Chapter 1. The evaluation models were essentially of the
"objectives-oriented" family, in that they accepted the objectives
of the program and assumed that the program was implemented as
described in the proposal that was funded. The evaluation models
focused on the posttest performance, using the pretest or some
proxy as an indicator of where the Chapter 1 students were supposed
to be at the end of the year. In all fairness to the developers
of the models, the models were never intended to identify which
components of the program were not working or why those components
were not working.

During the recent years of "search for excellence and school 
effectiveness", the Chapter 1 program office rightfully decided to 
push the Chapter 1 programs and the Technical Assistance Centers 
in the direction of "program improvement." As already indicated,
though, the currently available models were not designed to assist 
in this endeavor. The current models do a good job of providing 
the "go - no go" decision for the overall program, but provide no 
hint at all regarding the effectiveness of individual components. 
As a result, Chapter 1 Directors might look in other directions for 
evaluation tools. The accreditation model might be used by 
Directors, wherein they would invite experts to come into their 
program and provide an "expert opinion" as to the merits of the 
various components. The qualifications, experience, and biases of 
the "expert" may have a bearing on the evaluation results. 

The "naturalist" models provide another option, wherein a
relatively naive observer, using anthropological techniques, would
spend time observing the project. As a result of being immersed
in the project, the observer would then identify the pluses and
minuses of the project from the point of view of the observer. A
project may receive very different recommendations depending on
who the naive person was and the degree of naivete. The
naturalistic method also usually requires an enormous amount of
time and money.

The one group posttest only evaluation model has been
identified as a relatively inexpensive and fruitful model
(McNeil, 1990a, 1990b). The model can also identify which components
of the curriculum are not being successful. The reasons for this
lack of success would still need to be identified through other
evaluation procedures, but a narrowing process has already
occurred.



Method: The one group posttest only design can be utilized to
evaluate a compensatory program when there is no comparable
comparison group and when pretest data do not exist (Ryan, 1980).
The design requires content specialists to identify which
objectives on the posttest were included in the compensatory
curriculum (the C objectives), and which objectives were included
only in the regular curriculum (the R objectives). Exhibit 1
provides a schematic representation of a 20 item test with the R
and C designations. The compensatory students should perform
better on those C objectives to which they were exposed in both the
regular and the compensatory program (the double dosing effect),
than on those R objectives that they were exposed to only in the
regular curriculum.
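
The designation step can be sketched in a few lines of Python. This is only an illustration: the function name and the dictionary layout are assumptions, and the curriculum judgments are the ones shown for the visible rows of Exhibit 1. Note that, following the exhibit, an item taught only in the Chapter 1 curriculum (item 8) is omitted rather than treated as a C item.

```python
# Hypothetical helper; the names here are illustrative, not from the paper.
def designate(in_regular: bool, in_chapter1: bool) -> str:
    """Return the analysis designation for one posttest item."""
    if in_regular and in_chapter1:
        return "C"     # taught in both programs (the double dose)
    if in_regular:
        return "R"     # taught only in the regular curriculum
    return "OMIT"      # not in the regular curriculum: left out of the analysis

# Content specialists' judgments for the visible rows of Exhibit 1:
# item number -> (in regular curriculum, in Chapter 1 curriculum)
judgments = {
    1: (True, True), 2: (True, True), 3: (True, True),
    4: (True, False), 5: (True, False), 6: (True, False),
    7: (False, False), 8: (False, True), 20: (True, True),
}

designations = {item: designate(*flags) for item, flags in judgments.items()}
print(designations)
```

Only the C and R items are carried forward into the analyses; the OMIT items play no further role.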

Analysis: One could compare the percent correct on the items
measuring the two groups of objectives. The analysis would be a
simple t-test of the difference between two groups, one group being
the C items and the other group being the R items, as indicated in
Exhibit 1, producing a result as in Figure 1.
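
A minimal sketch of analysis I follows, assuming a pooled-variance two-sample t statistic computed with the Python standard library (the paper says only "a simple t-test", so the pooled form is an assumption). The unit of analysis is the item, not the student. The percent-correct values are those of the visible C and R rows of the hypothetical exhibit, with item 5 taken as .68 as listed in Exhibit 2; items 9 through 19 are not shown, so the numbers are illustrative only.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df

# Percent correct on the visible C items (1, 2, 3, 20) and R items (4, 5, 6).
c_items = [0.40, 0.78, 0.80, 0.20]
r_items = [0.30, 0.68, 0.10]

t_stat, df = pooled_t(c_items, r_items)
mean_gap = mean(c_items) - mean(r_items)
print(f"C mean - R mean = {mean_gap:.3f}, t = {t_stat:.2f}, df = {df}")
```

The resulting t would be referred to a t distribution with the reported degrees of freedom; a positive value favors the double-dosed C items.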

It is possible that the items measuring the one group of
objectives are of different difficulty than the items that are
measuring the other group of objectives. The solution to this
potential dilemma is to statistically equate the difficulty of the
items by covarying the inherent difficulty of the items. One could
use the difficulty information from any of: 1) the norming sample,
2) the non-compensatory students in the same school, 3) the results
from the non-compensatory students in the same school in previous
years, or 4) the results from one or more LEAs using a similar
curriculum and similar in demographics. Since the difficulty
information is used only as a covariate, the adequacy of the
information is not too crucial. That is, the additional group is
only providing information as to the difficulty of items on the
posttest and the group is not being used as a comparison group.
The analysis would be a covariance analysis, covarying the
difficulty of the items. The covariate is in the last column in
Exhibit 1, and would produce a result as in Figure 2. The above
analyses can be performed on all of the items in the test
or a subset.
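
Analysis II can be sketched as a one-covariate analysis of covariance: estimate a common (pooled within-group) slope of percent correct on inherent difficulty, then compare the C and R means after adjusting both to the same difficulty. The function name and data layout below are assumptions, the inherent difficulty index is assumed to be on the same proportion-correct scale as the posttest column, and the data are again the visible rows of the hypothetical exhibit (item 5 taken as .68, as in Exhibit 2). A significance test for the adjusted difference is left to a statistics package.

```python
from statistics import mean

def adjusted_difference(c_items, r_items):
    """ANCOVA-style comparison with one covariate.

    Each item is an (inherent_difficulty, percent_correct) pair. Returns the
    pooled within-group slope and the C minus R difference in percent
    correct after adjusting both groups to a common difficulty.
    """
    def sums(items):
        x_bar = mean(x for x, _ in items)
        y_bar = mean(y for _, y in items)
        s_xx = sum((x - x_bar) ** 2 for x, _ in items)
        s_xy = sum((x - x_bar) * (y - y_bar) for x, y in items)
        return x_bar, y_bar, s_xx, s_xy

    cx, cy, c_sxx, c_sxy = sums(c_items)
    rx, ry, r_sxx, r_sxy = sums(r_items)
    slope = (c_sxy + r_sxy) / (c_sxx + r_sxx)  # common slope on difficulty
    return slope, (cy - ry) - slope * (cx - rx)

# (inherent difficulty, posttest percent correct)
c_items = [(0.40, 0.40), (0.68, 0.78), (0.85, 0.80), (0.16, 0.20)]
r_items = [(0.40, 0.30), (0.78, 0.68), (0.20, 0.10)]

slope, adj_gap = adjusted_difference(c_items, r_items)
print(f"pooled slope = {slope:.3f}, adjusted C - R difference = {adj_gap:.3f}")
```

The adjusted difference is the vertical gap between the two parallel lines sketched in Figure 2.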
When there is a desire to use the evaluation information for
program improvement, one would want to analyze specific subsets
of the items. For example:

. items can be grouped into curriculum units
. items can be grouped as to first semester or second semester
. items can be grouped into various taxonomic levels

An example of various taxonomic levels will be presented to
illustrate the point. In Exhibit 2 the items in Exhibit 1 have
been classified (1) into one of three different classes
following the taxonomy of Bloom (1966), and (2) by the semester in
which they were supposed to be taught, first or second. The results
in Figure 3 clearly show that the Chapter 1 students did better on
the "knowledge" objectives that were in the Chapter 1 and Regular
programs than on the objectives that were just in the Regular
Program. (Statistical significance can be determined as illustrated
in McNeil (1990).) The results in Figure 3 are the kind of results
that would be expected from the Chapter 1 students being
double-dosed on the C objectives and only single-dosed on the R
objectives.

Figure 4 indicates less success for the Chapter 1 students on
"application" objectives. That is, Chapter 1 students do only a
little better on "application" objectives when they are double-dosed
than when they get only the Regular dose of the "application"
objectives.

Figure 6 indicates that the Chapter 1 students, and hence the
Chapter 1 program, are not successful with "synthesis" objectives.
Even though the Chapter 1 students received instruction on the
"synthesis" objectives in both the Chapter 1 classroom and the
regular classroom, they still did not perform any better on those
double-dosed objectives than they did on the "synthesis" objectives
which were only taught in the regular classroom.

Results such as those in Figure 3 would be expected.
Taxpayers have paid extra money for the double-dosing and therefore
rightfully expect higher performance on those objectives. Results
such as those in Figure 4 are less exciting and might occur if
teachers don't teach these "application" objectives as well as they
should or if Chapter 1 students don't learn these "application"
objectives as well as they should. Possibly only a small amount
of Chapter 1 time is spent on these "application" objectives, while
the larger part of the Chapter 1 time is spent on the "knowledge"
objectives in Figure 3.
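
The subset analyses discussed above begin by partitioning the analyzable items. The sketch below groups the visible rows of Exhibit 2 by taxonomic level or by planned semester and keeps the C and R item numbers separate; the tuple layout and function name are assumptions made for illustration, and item 20's designation is taken from Exhibit 1. Each resulting subset would then be passed to analysis I or analysis II.

```python
from collections import defaultdict

# (item number, taxonomic level, semester planned, designation)
items = [
    (1, "knowledge", "first", "C"),
    (2, "application", "first", "C"),
    (3, "synthesis", "second", "C"),
    (4, "knowledge", "first", "R"),
    (5, "application", "second", "R"),
    (6, "synthesis", "first", "R"),
    (7, "application", "first", "OMIT"),
    (8, "synthesis", "second", "OMIT"),
    (20, "application", "first", "C"),
]

def subset_by(rows, field):
    """Group analyzable item numbers by a tuple index (1 = level, 2 = semester)."""
    groups = defaultdict(lambda: {"C": [], "R": []})
    for row in rows:
        designation = row[3]
        if designation == "OMIT":
            continue  # items outside the analysis are dropped
        groups[row[field]][designation].append(row[0])
    return dict(groups)

by_level = subset_by(items, 1)
by_semester = subset_by(items, 2)
print(by_level)
print(by_semester)
```

With only nine of the twenty items visible, the subsets here are far too small to analyze; the point is the bookkeeping, not the result.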

Results such as those in Figure 6 are unacceptable and
explanations such as those offered above need to be found. Perhaps
Chapter 1 teachers were not provided enough inservice on how to
teach "synthesis" objectives. Perhaps Chapter 1 teachers did not
have enough time to include all of the material and purposefully
left out the higher-order skill of "synthesis." Perhaps
Chapter 1 students did not receive enough support in the
regular classroom; perhaps they were led to believe that low
achievers cannot be successful on higher-order skills such as
"synthesis."
The specific reason for lack of success would have to be
identified through additional evaluation procedures, such as:

. Evaluation of staff development to determine if inservice
  emphasized all aspects of the Chapter 1 curriculum, or

. Observation of Chapter 1 teachers to determine if the lesson
  plans allowed for enough time to teach "synthesis", or

. Observation or questioning of Regular teachers to determine
  if Chapter 1 students received encouragement on all kinds of
  objectives in the regular classroom.

Special concerns: The design rests heavily on the ability of the
curriculum specialists to identify accurately those objectives that
were included in the two curricula. The task can be made a little
easier by using a criterion-referenced test that has been designed
to measure the regular curriculum. In such a case, the content
people only have to identify those objectives that are in the
compensatory curriculum.

In most school systems there is the additional assumption that 
the teachers actually taught the curriculum (and that the students 
listened to and learned from the curricula). The extent to which 
these assumptions are tenable causes problems for all evaluation 
models, but only reduces the likelihood of obtaining significant 
results in favor of the compensatory program in the one group 
posttest only design. 

Potential problems: Since this is a new design, one might wonder 
about whether or not there might be some problems in implementing 
the design. The author successfully implemented this design in a 
Chapter 1 program in Dallas (McNeil, 1990a). Several potential 
problems, though, might be considered. 

Calculations. As with any new evaluation model, ease in
implementation is a reasonable concern. Analysis I is a
straightforward computation. Analysis II requires an evaluator who
understands covariance. For those who understand this concept, the
interpretive value of this analysis far outweighs the additional
calculation burden. Existing computer packages such as SAS and
SPSS can easily perform the calculations.

Aggregation of data. State and Federal evaluators want the 
data to be collapsible across LEAs. If the data are transformed 
to logits, a fairly straightforward procedure, one should be able
to aggregate the results. On the other hand, evaluation for
program improvement should be oriented to the project, and not to
the aggregation needs of the Federal government. 
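
The logit transformation mentioned above is the natural log of the odds, ln(p / (1 - p)). The sketch below shows the transformation and its inverse; how the transformed values would then be combined across LEAs is not specified in the paper and is not attempted here. The function names and the sample proportions are illustrative assumptions.

```python
from math import exp, log

def logit(p: float) -> float:
    """Log-odds of a proportion; undefined at exactly 0 or 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return log(p / (1.0 - p))

def inverse_logit(z: float) -> float:
    """Map a logit back to a proportion."""
    return 1.0 / (1.0 + exp(-z))

# Illustrative item proportions correct.
percent_correct = [0.40, 0.78, 0.80, 0.20]
logits = [logit(p) for p in percent_correct]
print([round(z, 3) for z in logits])
```

Items answered correctly by all or none of the students have no finite logit, so an evaluator would need a convention for such items before aggregating.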

Interpretation of results. The interpretation of results will
have to rely on usage over time, as did the NCE metric when it was
first introduced. It should be clear by now that the item level
interpretations provide insights into curricular and
teaching modifications that are not available with the current
Chapter 1 evaluation models.

Determination of which curriculum items are in. The
determination probably needs to be made by content specialists
rather than by evaluators. The task can be difficult and time
consuming. On the other hand, one might argue that the content
specialists should know both the regular and compensatory curricula
well enough so that the task would not be that difficult, as was
the case in the one application. In addition, such determinations
are usually made when an LEA makes a test adoption decision. (One
added benefit of this design is that the test adoption decision is
less crucial for the compensatory program. Those items that are
not in an LEA's curriculum or in the compensatory curriculum can
be omitted from the analysis, which is not possible in the RMC
evaluation models.)

Teacher implementation of curriculum. If the Chapter 1
teachers do not implement the Chapter 1 program as expected, then
the analysis will wrongly accuse the Chapter 1 program of not being
effective. Observation of Chapter 1 teachers could avoid this
conclusion.

Only low difficulty items in the curriculum. A Chapter 1 
curriculum might focus on low-level objectives, but most tests are 
designed such that each objective is measured by items of varying 
difficulty. If indeed the Chapter 1 curriculum is measured only 
by items of low difficulty, then analysis I will lead to an 
incorrect conclusion, but analysis II will still be applicable. 
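
The contrast between the two analyses in this situation can be shown with a small constructed example. The numbers below are invented for illustration and are not from the paper: every item is answered exactly as well as its inherent difficulty index predicts (no program effect), but the C items are all easy and the R items are all harder. The index is assumed to be a proportion correct in a reference group, so easy items carry high values.

```python
from statistics import mean

def raw_and_adjusted_gap(c_items, r_items):
    """Return (raw C - R gap, difficulty-adjusted C - R gap).

    Each item is an (inherent_difficulty, percent_correct) pair; the
    adjustment uses the pooled within-group slope, as in a one-covariate
    analysis of covariance.
    """
    def sums(items):
        x_bar = mean(x for x, _ in items)
        y_bar = mean(y for _, y in items)
        s_xx = sum((x - x_bar) ** 2 for x, _ in items)
        s_xy = sum((x - x_bar) * (y - y_bar) for x, y in items)
        return x_bar, y_bar, s_xx, s_xy

    cx, cy, c_sxx, c_sxy = sums(c_items)
    rx, ry, r_sxx, r_sxy = sums(r_items)
    slope = (c_sxy + r_sxy) / (c_sxx + r_sxx)
    raw = cy - ry
    return raw, raw - slope * (cx - rx)

# Constructed data: percent correct equals inherent difficulty for every item.
easy_c_items = [(0.80, 0.80), (0.85, 0.85), (0.90, 0.90)]
harder_r_items = [(0.30, 0.30), (0.40, 0.40), (0.50, 0.50)]

raw_gap, adjusted_gap = raw_and_adjusted_gap(easy_c_items, harder_r_items)
print(f"raw gap = {raw_gap:.2f}, adjusted gap = {adjusted_gap:.2f}")
```

In this constructed case the raw comparison of analysis I shows a large advantage for the C items that is due entirely to item difficulty, while the adjusted comparison of analysis II shows none.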

Testing out of level. Many compensatory students take a lower 
level test, as recommended by the developers of the Chapter 1 
evaluation models (Roberts, 1981). Since the same kind of
curriculum fit determinations can be made with an out of level test 
as with an on level test, testing out of level would not cause a 
problem with the new evaluation model. 

Summary: An evaluator may on occasion be confronted with the need
to produce an evaluation of a compensatory program when there is
no available comparison group and when no pretest data are
available. The design discussed in this paper provides a tool for
obtaining an evaluation under such constraining circumstances,
without sacrificing any evaluation principles.

The design is particularly valuable for three reasons. First,
few, if any, evaluators ever find a perfect comparison group in the
real world. In this design, the students serve as their own control.
Second, if program gains are evaluated over a school year, which
they usually are, it may be inappropriate to use the same test for
both pretest and posttest. It may be very difficult to
identify a test which adequately measures the objectives desired
at the posttest and which can be administered at pretest. Finally,
and most importantly for this paper, the design allows for the
identification of which components of the Chapter 1 program are
successful and which are not so successful, providing guidance for
program improvement decisions.

NOTE: I would like to thank Joe Ryan for initially discussing this
design, and Napoleon Mitchell, Gail Smith, Wayne Murray, Jill am
Denton, George Powell, James English, and David Vines for forcing
me to have a better conceptual ...
want to thank Barbara Mathews, ...
identifying the items and help...



         In Regular   In Chapter 1   Item          Posttest          Inherent
Item #   Curriculum   Curriculum     Designation   Percent Correct   Difficulty

   1         Y             Y             C               .40            .40
   2         Y             Y             C               .78            .68
   3         Y             Y             C               .80            .85
   4         Y             N             R               .30            .40
   5         Y             N             R               .68            .78
   6         Y             N             R               .10            .20
   7         N             N             OMIT            .20            .40
   8         N             Y             OMIT            .50            .78
   .
   .
  20         Y             Y             C               .20            .16

Exhibit 1. Sample design.
         Taxonomic     Semester   Item          Posttest          Inherent
Item #   Level         Planned    Designation   Percent Correct   Difficulty

   1     knowledge     first      C                   .40            .40
   2     application   first      C                   .78            .68
   3     synthesis     second     C                   .80            .85
   4     knowledge     first      R                   .30            .40
   5     application   second     R                   .68            .78
   6     synthesis     first      R                   .10            .20
   7     application   first      OMIT                .20            .40
   8     synthesis     second     OMIT                .50            .78
   .
   .
  20     application   first      C                   .20            .16

Exhibit 2. Sample design, with program improvement application.
Figure 1. Schematic results from analysis I, two group means.
[Plot: posttest percent correct against inherent difficulty (.0 to .8), with separate lines for C items and R items.]

Figure 2. Schematic results from analysis II, inherent difficulty
as covariate.
[Plot: posttest percent correct against inherent difficulty (.0 to .8) for knowledge items, with separate lines for C items and R items.]

Figure 3. Schematic results from analysis II, on Knowledge items.
[Plot: posttest percent correct against inherent difficulty for application items, with separate lines for C items and R items.]

Figure 4. Schematic results from analysis II, on Application items.
[Plot: posttest percent correct against inherent difficulty for synthesis items, with separate lines for C items and R items.]

Figure 6. Schematic results from analysis II, on Synthesis items.